A novel approach to locomotion learning: Actor-Critic architecture using central pattern generators and dynamic motor primitives

Authors

  • Cai Li
  • Robert Lowe
  • Tom Ziemke
Abstract

In this article, we propose a bio-inspired controller architecture that addresses the problem of learning different locomotion gaits for different robot morphologies. The modeling objective is split into two parts: baseline motion modeling and dynamics adaptation. Baseline motion modeling aims to achieve the fundamental functions of a certain type of locomotion, and dynamics adaptation provides a "reshaping" function for adapting the baseline motion to the desired motion. Based on this assumption, a three-layer architecture is developed using central pattern generators (CPGs, a bio-inspired locomotor center for the baseline motion) and dynamic motor primitives (DMPs, a model with universal "reshaping" functions). We use this architecture with actor-critic algorithms to find a good "reshaping" function. To demonstrate the learning power of the actor-critic based architecture, we tested it in two experiments: (1) learning to crawl on a humanoid and (2) learning to gallop on a puppy robot. Two types of actor-critic algorithms (policy search and policy gradient) are compared in order to evaluate the advantages and disadvantages of different actor-critic based learning algorithms for different morphologies. Finally, based on the analysis of the experimental results, a generic view/architecture for locomotion learning is discussed in the conclusion.
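The split between baseline motion and "reshaping" can be illustrated with a small sketch. It is not the authors' model: the sinusoidal baseline, the von Mises basis functions, and every name and parameter below (`cpg_baseline`, `reshaping_term`, `joint_trajectory`, `width`, `frequency`, `dt`) are assumptions chosen for illustration, and the reshaping term is a simplified stand-in for a DMP forcing term rather than a full DMP. The `weights` list plays the role of the parameters that an actor-critic learner would adjust.

```python
import math

def cpg_baseline(phase, amplitude=1.0):
    """Baseline rhythmic signal: a plain sinusoid of the oscillator phase (assumed form)."""
    return amplitude * math.sin(phase)

def reshaping_term(phase, weights, width=2.0):
    """Normalized weighted sum of periodic (von Mises) basis functions.

    Simplified stand-in for a DMP forcing term; the basis choice is an assumption.
    """
    n = len(weights)
    centers = [2.0 * math.pi * i / n for i in range(n)]
    activations = [math.exp(width * (math.cos(phase - c) - 1.0)) for c in centers]
    total = sum(activations)
    return sum(w * a for w, a in zip(weights, activations)) / total

def joint_trajectory(weights, steps=100, frequency=1.0, dt=0.01):
    """Baseline motion plus learned reshaping, sampled over `steps` time steps."""
    phase = 0.0
    samples = []
    for _ in range(steps):
        samples.append(cpg_baseline(phase) + reshaping_term(phase, weights))
        # Advance the oscillator phase and wrap it to [0, 2*pi).
        phase = (phase + 2.0 * math.pi * frequency * dt) % (2.0 * math.pi)
    return samples
```

With all weights at zero the output is the unmodified baseline, so in this sketch learning only has to find the deviation from the baseline gait, not the whole trajectory.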

Similar resources

Humanoids Learning to Walk: A Natural CPG-Actor-Critic Architecture

The identification of learning mechanisms for locomotion has been the subject of much research for some time but many challenges remain. Dynamic systems theory (DST) offers a novel approach to humanoid learning through environmental interaction. Reinforcement learning (RL) has offered a promising method to adaptively link the dynamic system to the environment it interacts with via a reward-base...

Applying the Episodic Natural Actor-Critic Architecture to Motor Primitive Learning

In this paper, we investigate motor primitive learning with the Natural Actor-Critic approach. The Natural Actor-Critic consists of actor updates which are achieved using natural stochastic policy gradients while the critic obtains the natural policy gradient by linear regression. We show that this architecture can be used to learn the “building blocks of movement generation”, called motor ...

Crawling Posture Learning in Humanoid Robots using a Natural-Actor-Critic CPG Architecture

In this article, a four-cell CPG network, exploiting sensory feedback, is proposed in order to emulate infant crawling gaits when utilized on the NAO robot. Based on the crawling model, the positive episodic natural-actor-critic architecture is applied to learn a proper posture of crawling on a simulated NAO. By transferring the learned results to the physical NAO, the transferability from simu...

Reinforcement Learning for Biped Robot

Animal rhythmic movements such as locomotion are considered to be controlled by neural circuits called central pattern generators (CPGs), which generate oscillatory signals. Motivated by such biological mechanisms, rhythmic movements controlled by CPGs have been studied. As an autonomous learning framework for the CPG controller, we propose a reinforcement learning method, which is called the...

A novel actor-critic-identifier architecture for approximate optimal control of uncertain nonlinear systems

An online adaptive reinforcement learning-based solution is developed for the infinite-horizon optimal control problem for continuous-time uncertain nonlinear systems. A novel actor–critic–identifier (ACI) is proposed to approximate the Hamilton–Jacobi–Bellman equation using three neural network (NN) structures—actor and critic NNs approximate the optimal control and the optimal value function,...

Journal title:

Volume 8, Issue

Pages -

Publication date: 2014